Data Cleaning in SQL Server
Scroll down to the bottom of the page and select 'View Project Code' to view the SQL code on GitHub
About the Data
The dataset to be cleaned was regarding Nashville Housing Data. It's raw format was an Excel file. The more relevant data can be summarised into 3 main categories: Property Details (build year, address, value, acreage etc), Sales Details (date of purchase, price) and Owner Details (name, address).Methodology
The steps for this project are outlined below, along with the SQL techniques utilised to carry out the data cleaning task :1) Standardizing the Date Format
- DDL
- DML
- Self joins
- String functions - CHARINDEX(), SUBSTRING(), REPLACE(), PARSENAME(), LEN(), LTRIM()
- CASE expression
- CTE's
- Window functions
- DDL